When the cosmic star formation history peaks (z ~ 2), galaxies vigorously
fed by cosmic reservoirs are gas dominated and contain massive star-forming
clumps, thought to form by violent gravitational instabilities in highly
turbulent gas-rich disks. However, a clump formation event has not been
witnessed yet, and it is debated whether clumps survive energetic feedback
from young stars, thus migrating inwards to form galaxy bulges. Here we
report spatially resolved spectroscopy of a bright off-nuclear emission line
region in a galaxy at z = 1.987. Although this region dominates the star
formation in the galaxy disk, its stellar continuum remains undetected in
deep imaging, revealing an extremely young (age < 10 Myr) massive clump,
forming through the gravitational collapse of > 10⁹ M⊙ of gas. Gas
consumption in this young clump is > 10× faster than in the host galaxy,
displaying high star formation efficiency during this phase, in agreement
with our hydrodynamic simulations. The frequency of older clumps with
similar masses coupled with our initial estimate of their formation rate
(~ 2.5 Gyr⁻¹) supports long lifetimes (~ 500 Myr), favouring scenarios where
clumps survive feedback and grow the bulges of present-day galaxies.

The high spatial resolution and sensitivity of Hubble Space Telescope (HST)
imaging and spectroscopy routinely allow us to resolve giant star-forming
regions (clumps) inside galaxies at z ~ 2, three billion years after the Big
Bang. Stellar population modelling has revealed a wide range of ages for
clumps observed in the continuum, with an average age of ~ 100 Myr. Yet clump
formation rates and lifetimes remain poorly constrained. Continuum-based
stellar ages are likely underestimated since clumps lose stars and re-accrete
gas during their evolution, while very young ages (< 30 Myr) cannot be probed
with continuum imaging alone. High equivalent width (EW) emission lines are
required.

We obtained 16 orbits of HST Wide Field Camera 3 G141 slitless spectroscopy
and imaging with the F140W, F105W and F606W filters targeting a galaxy
cluster at z = 2. The F606W band traces the star formation distribution in
the UV rest-frame, while the F140W probes the optical rest-frame, reflecting
the stellar mass distribution. Nebular [OIII]λ5007Å emission was detected for
68 galaxies with stellar masses 9.5 < log(M/M⊙) < 11.5 and redshift
1.3 < z_spec < 2.3, with measurements or upper limits for Hβ, [OII]λ3727Å and
Hα when available. From spatially resolved emission line maps we discovered
a galaxy at z = 1.987 with a remarkably bright, off-nuclear emission line
region (F_[OIII] = (4.3 ± 0.2) × 10⁻¹⁷ erg cm⁻² s⁻¹, observed, plus Hβ and
[OII]; Methods), lacking any obvious counterpart in broad-band imaging
(Figure 1). The [OIII] emission is spatially unresolved (radius < 500 pc) and
located at an apparent distance of 1.6 ± 0.3 kpc (offset significance 7.6σ,
Methods) from the nucleus (i.e., the barycenter of the stellar continuum).
The deprojected distance is constrained within 3.6 < d < 6.2 kpc,
corresponding to 1.3 - 2.2 times the galaxy half-light radius (Methods).
Subtracting a point-like emission leaves no significant residuals in the
[OIII] map. The continuum reddening and mass-to-light ratio (M/L) maps are
flat over the galaxy, ruling out the possibility that the feature is
artificially induced by dust lanes or inhomogeneous attenuation (Extended
Data [ED] Figures 1 and 2; Methods).

From emission line ratios we estimated a reddening E(B−V) ~ 0.3 and a
gas-phase metallicity Z ~ 0.4 Z⊙ for this region, consistent with the host
galaxy. Robust upper limits on its stellar continuum were estimated with
detailed simulations, leading to remarkably high lower limits on the emission
line EWs. Given these limits and the line luminosities, the emitting region
cannot be powered by a massive black hole nor by shock ionisation from wind
outflows. We similarly disfavor the hypothesis of a transient, since the line
luminosities remain constant over time, or an ex situ merging system, since
its older underlying stellar continuum would be detected. Also, this galaxy
is classified as a disk (not a merger) from its Asymmetry and M20 parameters
(Methods). Therefore, a young star-forming clump formed in situ is the most
plausible interpretation. On the basis of stellar population synthesis
modelling for galaxies with active star formation, the observed EWs require
very young ages for the star formation event, with a firm upper limit of
10 Myr (Figure 2). Thus, while the ubiquity of clumps in high-z galaxies has
been known for a decade, we are witnessing here for the first time the
formation of a star-forming clump in the early stage of its gravitational
collapse. From the reddening-corrected line luminosities we estimate a clump
star formation rate SFR = 32 ± 6 M⊙ yr⁻¹, comparable to the rest of the host
galaxy disk (Methods). The F140W continuum non-detection translates into a
stellar mass limit M* < 3 × 10⁸ M⊙. To infer the underlying gas mass of the
clump we considered the Jeans mass of the galaxy as a plausible upper limit,
as fragmentation at higher masses is unlikely. This constrains the clump gas
mass to Mgas ≲ 2.5 × 10⁹ M⊙, assuming a maximal gas velocity dispersion
σ_V ≲ 80 km s⁻¹ (refs 21, 22; Methods).

This finding offers new insights into the physics of clump formation in
gas-rich turbulent media at high-z. Using the estimate of its underlying gas
mass, stellar mass and SFR we can constrain the nature of its star formation
mode. Its specific star formation rate (sSFR = SFR/M*) is > 30 times higher
than that of its host galaxy, a typical Main Sequence (MS) galaxy at z ~ 2.
Similarly, the lower limit on the clump star formation efficiency
(SFE = SFR/Mgas) is > 10 times higher than that of normal galaxies
(Figure 3), a behaviour that at galaxy-wide scales is only observed for
extreme starbursts. At sub-galactic scales such a high SFE is observed for
nearby molecular clouds, which are small and transient features a thousand
times less massive than the present clump. Possibly at odds with what has
been assumed so far, this provides observational evidence that giant clumps
do not follow the Schmidt-Kennicutt law of normal star-forming galaxies, at
least in the early stages of collapse. Instead, this luminous sub-galactic
structure appears to follow the universal star formation law normalized by
the dynamical time. Comparing with the SFRs reported for older clumps with
similar masses, we estimated a SFR enhancement of ~ 3 - 5 at “peak formation”
with respect to later phases. This is the first observation of a massive
star-forming clump with a robust stellar age estimate that is similar to or
shorter than its dynamical time (Methods).
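
As a rough numerical check of the ratios discussed above, the short Python
sketch below turns the limits quoted in the text into lower limits on the
clump sSFR and SFE, and into an upper limit on the gas depletion time. The
inputs (SFR = 32 M⊙ yr⁻¹, M* < 3 × 10⁸ M⊙, Mgas ≲ 2.5 × 10⁹ M⊙) are taken as
read from the text; the variable and function names are illustrative only and
are not part of the original analysis.

```python
# Illustrative inputs, taken as read from the text (not a re-analysis).
SFR_CLUMP = 32.0      # Msun / yr, clump star formation rate
MSTAR_UPPER = 3e8     # Msun, upper limit on the clump stellar mass
MGAS_UPPER = 2.5e9    # Msun, upper limit on the clump gas mass (Jeans mass)


def per_gyr(rate_per_yr):
    """Convert a rate in yr^-1 into Gyr^-1."""
    return rate_per_yr * 1e9


# Upper limits on the masses translate into lower limits on sSFR and SFE.
ssfr_lower_per_gyr = per_gyr(SFR_CLUMP / MSTAR_UPPER)
sfe_lower_per_gyr = per_gyr(SFR_CLUMP / MGAS_UPPER)

# The depletion time is 1 / SFE, so a lower limit on SFE is an upper limit here.
t_dep_upper_myr = 1e3 / sfe_lower_per_gyr

print(f"sSFR  > {ssfr_lower_per_gyr:.0f} Gyr^-1")
print(f"SFE   > {sfe_lower_per_gyr:.1f} Gyr^-1")
print(f"t_dep < {t_dep_upper_myr:.0f} Myr")
```

With these inputs the gas depletion time of the clump comes out below
100 Myr, which is the sense in which the clump behaves like a starburst
rather than a normal MS galaxy.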

Prompted by our observations, we investigated the properties of clumps in
their formation phase using high resolution simulations. We solved the dark
matter, stellar and gas gravity and hydrodynamics at a resolution of 3.5 pc,
with gas cooling down to 100 K, and we modelled the feedback processes from
young stars onto the gas: photo-ionization, radiation pressure and supernova
explosions. Figure 4 shows a typical, M* ~ 3 × 10¹⁰ M⊙, z = 2 galaxy model
with giant clumps formed through violent disk instability. Their formation
sites are located at 2.1 - 7.0 kpc from the nucleus of the galaxy, which has
a half-mass radius of 4.5 kpc, consistent with many other simulations and
with our observations. All clumps, and especially the youngest ones, are
brighter in the SFR map than in the continuum: they undergo a burst of star
formation during their initial collapse, with peak SFRs of about
10 - 20 M⊙ yr⁻¹ consistent with our observations, then evolve to a lower sSFR
regime within 20 Myr, once feedback regulates star formation and their
stellar mass has grown. Our simulations further corroborate the idea that all
clumps behave like galactic miniatures of starbursts in the Schmidt-Kennicutt
diagram during their first 20 Myr (Figure 3). The SFE of simulated clumps
decreases at later times, although it remains > 0.5 dex higher than that of
MS galaxies, consistent with their shorter dynamical times. The presence of
massive clumps is probably an effective reason for the observed rise of the
SFE in normal MS galaxies from z = 0 to z = 2, given the increasing
prevalence of clumps at high-z. Furthermore, the violent burst-like behaviour
that young clumps show at formation is consistent with simulations predicting
that, thanks to their rapid collapse, giant clumps could form globular
clusters by converting gas into stars faster than stars expel the gas.


The short visibility window at high EWs (< 10 Myr independently of the star
formation history) has likely prevented until now the detection of the clump
formation phase. From this timing constraint and the single discovery in our
survey, we attempted a first estimate of the clump formation rate of
2.5 Gyr⁻¹ per galaxy (for M_clump ≳ 2.5 × 10⁹ M⊙, Methods). Given the
observation of 1 - 2 clumps per galaxy with similar masses, this converts
into a lifetime of ~ 500 Myr (Methods). This is longer than expected in
models of clump destruction by stellar feedback. Instead, it is
representative of the timescale needed for giant clumps formed in galactic
disks to migrate inward through dynamical friction and gravity torques and
coalesce to grow the central galactic bulges.

Our study demonstrates the detectability of ultra-young clumps in deep
surveys, indicating low formation rates and long lifetimes. This is crucial
to understand key issues of galaxy formation and evolution such as clump
migration, bulge formation and the role of feedback. However, future
observations of larger samples of forming clumps with direct measurements of
clump sizes, gas masses and velocity widths (and hence dynamical masses) are
required for a definitive understanding. This should be within the
capabilities of the complete Atacama Large Millimeter Array and the James
Webb Space Telescope. We note that spectroscopic surveys targeting high-z
galaxies (e.g. SINS, 3D-HST) have not yet reported the identification of
giant clumps at formation. This might suggest that they are rarer events than
what appears from our survey, which finally allowed us to identify a direct
signature of massive clump formation via gravitational collapse.

Figure 1: A massive, very young clump in a disk galaxy at z = 1.987. The
emission line maps show off-nuclear, unresolved, bright [OIII] together with
Hβ and [OII] emissions (a - c), respectively 18σ, 3σ and 3.5σ significant. No
counterpart in the direct images is detected (d - f). The flux contours of
the [OIII] map have been overplotted on the direct images. The color scales
logarithmically with flux from the minimum (black) to maximum (white) level
displayed (different for a - c, d - f). The black cross in each panel indicates
the barycenter of the stellar optical rest-frame continuum and the white circle
the PSF FWHM.

Figure 2: Constraints on the clump’s age from reddening corrected, rest-frame,
emission line EWs. Lower limits on the clump EW (black solid line) of [OIII],
Hβ, [OII] (a - c) and the ratio between the Hβ luminosity and the continuum
at 1500 Å (d) are compared with theoretical tracks. A Salpeter initial mass
function is assumed and different star formation histories (SFHs) are
compared (single burst, constant star formation rate, and a SFH predicted by
simulations, Figure 4). The effect of reddening (ΔE(B−V) = +0.1) is indicated
in each panel (red arrow). The age of the clump is constrained to be
< 10 Myr.

Figure 3: Schmidt-Kennicutt plane. We compare the trends for starbursts
and MS galaxies (black solid lines and shaded region indicating the 0.2 dex
dispersion of the MS) with the location of observed (red filled circle with
s.d. error bars) and simulated clumps (colored dots connected with lines;
solid and dashed lines for ages < 30 Myr and > 30 Myr, respectively). The
average location of simulated clumps (red, thick line) and host simulated
galaxy (square with s.d. error bars) are shown. Sudden variations in clumps’
Mgas are likely due to the accretion of gas-rich clouds or small clumps.







Figure 4: Numerical simulations of a high-redshift clumpy galaxy seen face-on. 
Maps of stellar mass and SFR are shown at HST-like resolution (a, b). All 
clumps have elevated SFR compared with M*, but this property is extreme 
for clump A, observed 12 Myr after its formation. Panel c shows the time 
evolution of the SFR and M* for clump A and for all the clumps (each star 
formation history is arbitrarily shifted in time to align the SFR peaks). All 
clumps experience an internal burst of star formation before evolving into a 
long-lasting regulated regime within 20 Myr. Yellow shaded regions indicate 
s.d. uncertainties. 







METHODS 


Emission line maps 

The 16 HST/WFC3 orbits of G141 slitless spectroscopy, taken along three
position angles (~ 0, −30, +15 degrees), were reduced with aXe. Residual
defects (bad pixels, cosmic ray hits, etc.) were removed with L.A.Cosmic.
Two-dimensional spectra were background subtracted with SExtractor, and
the continuum emission of the main target and surrounding sources (including
higher and lower order dispersion spectra) was removed by fitting their aXe
continuum models with free normalization (Figure 1).

Astrometrically calibrated emission line maps were obtained by
cross-correlating the spectral images of [OIII] (the brightest line) taken at
the three different position angles. This is preferable to cross-correlating
with the continuum image since our target has different broad-band and line
morphologies. For this step, the spectral images were combined with the IRAF
task WDRIZZLE, weighting each single orientation by its exposure time. The
astrometry of the Hβ and [OII] emission maps was tied to that of [OIII]. The
resulting redshift agrees accurately with Subaru/MOIRCS longslit
spectroscopy.

The [OIII] doublet is resolved at the spectral resolution of our data for
relatively compact galaxies. We removed the [OIII]λ4959Å component by
modelling the combined emission line images with GALFIT using an effective
point-spread function (PSF) consisting of a main lobe for the 5007Å line and
three fainter ones.

Clump continuum emission 

Visual inspection of the multi-band HST imaging did not reveal any evidence
of the clump, and the evaluation of their isophotal contours did not show
disturbances at its location. Thus we searched for its presence by modelling
the imaging with GALFIT (Extended Data [ED] Figure 3). A single Sérsic
profile provided a simplified fit, leaving strong positive and negative
residuals near the expected position of the clump. Such a pattern is a
systematic effect due to the presence of clumps at the outskirts of the
galaxy major axis, as they are not symmetrically located with respect to the
nucleus, resulting in an effective bending of the galaxy isophotes. Masking
the external regions and fitting the central part of the galaxy with a single
Sérsic profile left negligible residuals (< 5%). As a further check, we
fitted the direct images with the Multi-Gaussian Expansion parametrization
(MGE) algorithm, fitting average azimuthal light profiles with ellipsoidal
isophotes to the central part of the galaxy. The residuals are negligible
(< 5%). Analogous residuals resulted using three spatially offset Sérsic
profiles: one centered at the barycenter of the stellar light (as determined
by SExtractor from the F140W image) and two others, an order of magnitude
fainter, to the top left and bottom right. This is our best fit (baseline)
model for the galaxy continuum.

This three-component fit is a technical solution adopted due to the
irregular morphology of our target, typical of clumpy high-z disks, and
should not be taken to imply that the galaxy is an ongoing merger. In this
regard, we classified the galaxy as a disk based on the Asymmetry and M20
parameters measured on stellar mass maps derived from pixel-to-pixel spectral
energy distribution (SED) fitting (ED Figure 4), a diagnostic calibrated with
MIRAGE numerical simulations. Finally, the F105W/F140W ratio (ED Figure 2)
provides no evidence for a bulge.

Limits on the clump continuum were obtained with simulations, injecting
PSF components at approximately the same isophotal level of the expected
clump position, and fitting them together with our baseline model. From these
estimates we subtracted the contribution of emission lines ([OIII] and Hβ for
F140W, [OII] for F105W), obtaining a factor of 2 (1.3) deeper flux upper
limits for F140W (F105W). Normalizing a series of Starburst99 stellar
population synthesis models with different stellar ages to the most
constraining (F105W) upper limit allowed us to refine the F140W limit, which
is relevant for calculating the [OIII] and Hβ emission line EWs (ED
Figure 5).


Clump offset from the galaxy nucleus 

The clump is offset from the galaxy center: the observed distance between the
point-like [OIII] emission and the barycenter of the galaxy is 1.6 kpc, with
formally negligible measurement error. However, systematic uncertainties
exist, related to the astrometric calibration of the direct imaging and
slitless data, and to the stability of the wavelength solution. We estimated
the systematic uncertainties along the dispersion direction by evaluating the
distribution of differences between the measured and expected wavelengths of
bright emission lines (Hα and [OIII]) in the full survey data. Comparing the
position of galaxies in the direct imaging with that of the continuum
emission in the grism data, we evaluated the systematics in the
cross-dispersion direction. For each orientation of the grism we imposed:


\chi^2_{\rm red} = \frac{1}{N_{\rm dof}} \sum_i \frac{(S_{{\rm meas},i} - S_{\rm exp})^2}{\sigma_p^2 + \sigma_a^2} = 1    (1)


where N_dof are the degrees of freedom, S_meas,i and S_exp are respectively
the measured and expected positions of the emission lines (or the continuum)
of each galaxy, and σ_p and σ_a indicate respectively the formal measurement
errors on the emission line (or continuum) positions and the astrometric
uncertainties. Average systematic uncertainties are σ_a = 0.067"
(σ_a = 0.035") along the dispersion (cross-dispersion) direction. We computed
the uncertainties along the right ascension and declination directions by
projecting along the orientation of each dataset. Since the final,
astrometrically calibrated emission line maps are the weighted average of
three different orientations, we estimated the total uncertainties assuming
that the errors (ε_i) in each orientation are independent:

\epsilon_{\rm tot} = \frac{\sqrt{\sum_i (t_i \, \epsilon_i)^2}}{\sum_i t_i}    (2)

where t_i are the exposure times. The clump offset is detected at 7.6σ and
its projected distance from the galaxy nucleus (defined as the barycenter of
the stellar light) is 1.6 ± 0.3 kpc. If we had chosen the light peak of the
direct images as the nucleus, the offset would be comparable in magnitude
and significance. We prefer the light barycenter definition as it coincides
with the peak of the mass map (ED Figure 2).
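
The procedure described above can be sketched in a few lines of Python:
solve for the systematic term σ_a that brings the reduced χ² to 1, then
combine the per-orientation errors of an exposure-time-weighted average. This
is a minimal sketch under stated assumptions, not the pipeline actually used:
the residuals, formal errors, number of degrees of freedom and exposure times
below are made-up illustrative numbers, the function names are hypothetical,
and the combination rule is the standard propagation for independent errors.

```python
import math


def reduced_chi2(residuals, sigma_p, sigma_a, n_dof):
    """Reduced chi^2 with a systematic term sigma_a added in quadrature."""
    total = sum(r ** 2 / (sp ** 2 + sigma_a ** 2)
                for r, sp in zip(residuals, sigma_p))
    return total / n_dof


def solve_sigma_a(residuals, sigma_p, n_dof, hi=10.0, tol=1e-10):
    """Bisection for the sigma_a that gives reduced chi^2 = 1.

    Returns 0.0 when the formal errors alone already give chi^2_red <= 1.
    `hi` must be large enough that chi^2_red(hi) < 1.
    """
    if reduced_chi2(residuals, sigma_p, 0.0, n_dof) <= 1.0:
        return 0.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # chi^2_red decreases monotonically as sigma_a grows.
        if reduced_chi2(residuals, sigma_p, mid, n_dof) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def combine_orientations(errors, exposure_times):
    """Error on an exposure-time-weighted mean of independent measurements."""
    weighted = math.sqrt(sum((t * e) ** 2
                             for t, e in zip(exposure_times, errors)))
    return weighted / sum(exposure_times)


# Hypothetical position residuals (arcsec) and formal errors for six galaxies.
residuals = [0.08, -0.05, 0.07, -0.09, 0.04, -0.06]
sigma_p = [0.02] * 6
n_dof = 5  # assumed here to be N - 1

sigma_a = solve_sigma_a(residuals, sigma_p, n_dof)
print(f"sigma_a = {sigma_a:.3f} arcsec")

# Three orientations with equal (hypothetical) errors and exposure times.
print(f"combined = {combine_orientations([0.06] * 3, [1.0] * 3):.3f} arcsec")
```

Bisection is used rather than a closed-form solution so that the sketch also
works when the formal errors differ from object to object.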

To determine the deprojected distance, the axial ratio of the galaxy and
the angle θ between the galaxy major axis and the clump-nucleus direction are
needed. We estimated them from the range of solutions obtained by modelling
the direct images and the mass map with GALFIT and considering the outer
isophotes of the PSF-deconvolved galaxy. To further account for systematic
effects we also considered plausible uncertainties in the PSF derivation, and
further estimates based on the MGE software as an alternative to GALFIT.
Given an axial ratio 0.21 < q < 0.35 (inclination i ~ 70 - 78°) and
48° < θ < 52°, we computed a maximum plausible range for the deprojected
distance of the clump from the nucleus of 3.6 < d < 6.2 kpc, beyond the
galaxy effective radius Re = 2.8 ± 0.4 kpc (ED Table 1). We did not account
for the disk thickness: this uncertain correction could imply a larger
deprojected distance by 10 - 15% (for a typical thickness of a few hundred
pc).
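
To make the geometry concrete, the following Python sketch deprojects the
observed offset under the assumption of an infinitely thin, intrinsically
circular disk, for which cos i = q and the offset component along the minor
axis is stretched by 1/q. This thin-disk formula is an assumption made here
for illustration (the text does not spell out the expression used), and the
function names are hypothetical.

```python
import math


def inclination_deg(q):
    """Inclination of an infinitely thin circular disk with axial ratio q."""
    return math.degrees(math.acos(q))


def deprojected_distance(d_proj, theta_deg, q):
    """Deproject a sky-plane offset d_proj (kpc).

    theta_deg is the angle between the galaxy major axis and the
    clump-nucleus direction; q is the observed axial ratio.
    """
    theta = math.radians(theta_deg)
    along_major = d_proj * math.cos(theta)       # unaffected by inclination
    along_minor = d_proj * math.sin(theta) / q   # foreshortened by cos i = q
    return math.hypot(along_major, along_minor)


# Extremes of the ranges quoted in the text, for the nominal 1.6 kpc offset.
d_min = deprojected_distance(1.6, 48.0, 0.35)
d_max = deprojected_distance(1.6, 52.0, 0.21)

print(f"i = {inclination_deg(0.35):.0f} - {inclination_deg(0.21):.0f} deg")
print(f"d = {d_min:.1f} - {d_max:.1f} kpc")
```

For the nominal offset this simple model gives a range of roughly
3.6 - 6.1 kpc, close to the interval quoted above; the quoted interval also
folds in further systematics, so exact agreement is not expected.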

Dust reddening 

Estimating emission line luminosities and SFRs requires dust extinction
corrections (this is less relevant, though, for emission line EWs, affected
only by the differential line versus continuum reddening). We used stellar
population modelling of the UV-to-NIR galaxy SED, assuming the Calzetti et
al. reddening law and constant SFHs, to measure the stellar continuum
reddening. We converted this measure into nebular reddening using
E(B−V)_nebular = E(B−V)_continuum / 0.83, obtaining E(B−V)_nebular ≈ 0.30.
Independent estimates of the nebular reddening were also obtained based on
emission line ratios: (i) Hα/Hβ, assuming case B recombination conditions;
(ii) [OII]/Hα, assuming a fixed intrinsic ratio; and (iii) [OII]/Hβ, with
intrinsic ratio estimated following the previous points. For these estimates
we used Hα fluxes from MOIRCS, [OII] from WFC3 and Hβ from the weighted
average of MOIRCS and WFC3, obtaining: E(B−V)_Hα/Hβ = 0.24 ± 0.12;
E(B−V)_[OII]/Hα = 0.32 ± 0.11; E(B−V)_[OII]/Hβ = 0.40 ± 0.25. The average of
these estimates is nearly identical to that from the stellar continuum. We
therefore adopt E(B−V)_nebular = 0.30.
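
The averaging step can be checked with a short Python sketch that computes
both the plain mean and the inverse-variance weighted mean of the three
line-ratio estimates quoted above. Which average was actually used is not
stated in the text, so both are shown; the function name is illustrative, and
the weighted error treats the three estimates as independent, which is only
approximately true since they share some line fluxes.

```python
import math

# (value, 1-sigma error) for the Halpha/Hbeta, [OII]/Halpha and [OII]/Hbeta
# estimates of E(B-V) quoted in the text.
estimates = [(0.24, 0.12), (0.32, 0.11), (0.40, 0.25)]


def weighted_mean(pairs):
    """Inverse-variance weighted mean and its formal error."""
    weights = [1.0 / err ** 2 for _, err in pairs]
    mean = sum(w * val for w, (val, _) in zip(weights, pairs)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


simple = sum(val for val, _ in estimates) / len(estimates)
weighted, weighted_err = weighted_mean(estimates)

print(f"simple mean   = {simple:.2f}")
print(f"weighted mean = {weighted:.2f} +/- {weighted_err:.2f}")
```

Both averages land within a few hundredths of the adopted value of 0.30.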

For the clump a reddening estimate can be obtained using WFC3, from the
ratio of the [OII] and Hβ line fluxes. We derived a fairly noisy measurement
that is consistent with that of the whole galaxy
(E(B−V)_[OII]/Hβ,clump = 0.24 ± 0.37). While formally this is also consistent
with zero attenuation towards the clump, this is unlikely as the galaxy is
highly inclined. To improve the estimate of the reddening affecting the
clump, we attempted a derivation of the Hα flux of the clump in the MOIRCS
data, decomposing the 2D spectrum with a PSF-like component for the clump and
a single Sérsic profile accounting for the host galaxy disk, finding
Hα = (7 ± 2) × 10⁻¹⁷ erg s⁻¹ cm⁻², ~ 50% of the galaxy Hα emission.
Averaging the reddening estimates from Hα/Hβ, Hα/[OII] and [OII]/Hβ, we
obtained E(B−V)_nebular,clump = 0.55 ± 0.20, consistent with the reddening of
the host galaxy. We thus assumed that the clump nebular reddening is
identical to that of the parent galaxy, consistent with the literature. ED
Figure 2 shows the observed F606W/F105W ratio, probing the stellar continuum
reddening, which is homogeneous over the galaxy. The optical attenuation
(A_V) at the clump position is similar to that at the galaxy nucleus within
0.1 - 0.2 mag, and close to the galaxy average. The position of the galaxy
nucleus (measured as the light barycenter, light peak, or with GALFIT) is
stable and does not change with wavelength from F606W to F105W and F140W.
[OIII] and the F105W continuum should be affected by a similar attenuation,
and much less than the F606W continuum. Together with the flatness of the
reddening map, this demonstrates that the clump emission lines are not an
artifact due to reddening modifying the galaxy nucleus position, as an even
stronger effect would be seen in F606W. Correcting the emission line maps and
the imaging for reddening does not significantly alter the nucleus-clump
distance. Adopting the Cardelli et al. extinction law (see, e.g., Steidel et
al.) would produce reddening values < 15% higher, consistent within the
uncertainties.

Discarding the AGN, shock, transient, and low-metallicity region 
hypotheses 

The galaxy has three Chandra photons (1 soft and 2 hard; ~ 2σ detection)
in 146 ks of data, giving L_2−10 keV ~ 2.9 × 10⁴² erg s⁻¹ (photon index
Γ = 1.8). This is 10 times higher than expected from the galaxy star
formation. If an active galactic nucleus (AGN) were present, it would produce
an [OIII] luminosity ~ 20 times fainter than that of the clump. In ED
Figure 6 both the entire galaxy and the clump are located in the BPT diagram
(we conservatively use [NII]_clump ≤ [NII]_galaxy). The emission line ratios
are consistent with star-forming galaxies at z ~ 2. The [OIII]/[NII] < 2.8
upper limit is also much lower than typically observed in Type 1 AGNs. The
high EW further disfavors the hypothesis of an off-nuclear AGN, since AGNs
typically have EW_[OIII] < 500 Å. Besides, no AGN signature was found from
the galaxy SED, and no excess possibly arising from nuclear accretion is
detected in our deep 24 μm Spitzer, Herschel and VLA data.

The clump emission line luminosity is comparable with that of the whole
galaxy, hence it cannot be due to shocks from external outflows impacting the
gas. The host SFR would generate ~ 30 times weaker galaxy-integrated,
shock-excited line luminosities. The brightest shock-powered off-nuclear
clouds in local IR luminous galaxies are > 50 times weaker. Explicit
calculations for z = 2 galaxies using appropriate wind mass loads and
velocities lead to analogous conclusions. The kinetic energy available in
winds cannot account for the clump line luminosities.

There is no evidence for substantial variability of the line luminosities
over a ~ 3 yr timescale. HST/WFC3 G141 spectroscopy was obtained in June and
July 2010, and MOIRCS spectroscopy in April 2013 (ED Table 2). Despite
their lower resolution (0.6" seeing), MOIRCS spectra show the bright, compact
[OIII] and Hα emissions from the clump, with a consistent flux.



Low-mass (< 10⁹ M⊙), very metal poor galaxies (Z ~ 0.1 Z⊙) can display
extremely high EW emission lines. Our target is substantially more massive
and metal rich: using the [OIII]/[OII] ratio we estimated Z ~ 0.4 ± 0.1 Z⊙
and Z ~ 0.6 ± 0.2 Z⊙, for the clump and galaxy, respectively (ED Figure 4, ED
Table 1).

Constraining the age of the clump 

We computed the time evolution of the Hβ EW using stellar population
synthesis models, adopting Z = 0.4 Z⊙, a Salpeter initial mass function
(IMF), and three different SFHs: an instantaneous burst, constant star
formation and a SFH obtained from our hydrodynamic simulations (Figure 3).
All models show high EWs at young ages (log(EW) > 2, independent of the SFH),
which drop quickly for the instantaneous burst and more smoothly in the
other cases. We converted the Hβ EW into the expected [OIII] and [OII] EWs
(Figure 2), assuming the [OIII]/Hβ ratio of z = 2 star-forming galaxies and
a fixed Hα/[OII] luminosity ratio. Comparing with the EW lower limits we
inferred an age < 10 Myr for the clump (Figure 2).

The directly measured continuum upper limits (rather than the more stringent
ones from synthetic spectra; ED Figure 5) give an age < 15 Myr. Decreasing
the adopted E(B−V) reddening by 0.1 dex would reduce the line EW lower limits
by only ~ 0.05 dex, while increasing the L_Hβ/L_1500 limit by ~ 0.1 dex, and
hence would have hardly any effect on the age constraints. Changing the
metallicity by 1.6 dex produces only a 0.2 dex age difference. Similarly, the
age remains unchanged when adopting, e.g., a Kroupa or a Scalo IMF, and a
top-heavy one produces EWs 0.2 dex higher.

SFR estimate 

The SFR of the whole galaxy was determined from the total Hα luminosity
from the MOIRCS spectroscopy, assuming the standard Kennicutt conversion,
resulting in 77 ± 9 M⊙ yr⁻¹, in agreement with that from SED fitting
(~ 85 M⊙ yr⁻¹ with an uncertainty of 0.2 dex, ED Table 1).

The time-dependent conversion from line luminosity to SFR at young ages was
computed using Starburst99, adopting the SFH from our numerical simulations
(ED Figure 7). At t = 10 Myr this is 20% higher than Kennicutt. Averaging
the estimates from Hβ, [OII] and Hα we obtained SFR = 32 ± 6 M⊙ yr⁻¹ for
the clump, where the error includes the uncertainties associated with emission
line luminosities and reddening.

Stellar mass estimate 

Assuming the average mass-to-light ratio (M/L) of the host galaxy (ED
Figure 2), the flux upper limit on the continuum emission of the clump
implies M* < 3 × 10⁸ M⊙. Using the M/L ratio from the clump SFH (ED Figure 7)
gives M* < 2.1 × 10⁸ M⊙. Normalizing the simulations to the observed Hβ
luminosity yields M* ~ 3.9 × 10⁸ M⊙, consistent with the previous estimates
given the uncertainties (a factor ~ 2, mainly due to the gas fraction of
simulated galaxies and the details of feedback and stellar mass loss
modelling at small scales; note that the simulated clumps appear to be
slightly less massive than our observed one, on average, but their observed
physical properties and time behaviour are self-similar).


Gas mass estimate 

We inferred an upper limit to the gas mass in the clump from the Jeans mass
(M_J) of the galaxy, which is close to the maximum gas mass that can collapse
in a rotating disk. Assuming a reasonable upper limit for the typical gas
velocity dispersion in high-z disk galaxies (σ_V < 80 km/s), we obtained
Mgas < M_J = 2.5 × 10⁹ M⊙. Using the Mgas/Hβ ratio from simulations leads
to a gas mass of Mgas ~ 2.7 × 10⁹ M⊙. A further estimate comes from comparing
with older clumps from the literature, using our numerical simulations to
relate the physical properties at the “peak” and later phases. In this
scaling, SFR_lit and M*,lit refer to older clumps reported in the literature,
M*,old,sim and SFR_old,sim are for old clumps in the simulations, while
Mgas,young,sim and SFR_young,sim are computed at t = 10 Myr, as for our young
clump. This approach leads to Mgas = 3 × 10⁹ M⊙ ± 0.2 dex, consistent with
the independent estimates discussed above. This agreement supports SFHs with
an initial burst as predicted by our simulations.

The Schmidt-Kennicutt relation can be used to provide alternative estimates,
based on the clump SFR. The relation for MS galaxies would imply that ~ 50%
of the total gas in the galaxy is collapsing in an ultra-compact region,
which appears implausible, confirming that this young clump has a higher SFE.
Assuming instead the starburst-like relation, we obtained
Mgas = 2 × 10⁹ M⊙ (with an uncertainty of about 0.23 dex), consistent with
the previous estimates.

Considering an even younger age for the clump, as permitted by the upper
limit t < 10 Myr, would return higher absolute and specific SFRs (ED
Figure 7; i.e. higher SFR/L_Hβ), confirming the starburst behaviour of the
clump during the formation phase.


Dynamical time estimate 

Measuring the FWHM of the Hα line detected in the MOIRCS longslit
spectroscopic data, we determined a first upper limit on the gas velocity of
the clump, v_FWHM ≤ 450 km s⁻¹ (the MOIRCS instrumental resolution). Given
the upper limits on the radius of the clump (R < 500 pc) and on its
dynamical mass (M_dyn = Mgas + M* < 2.8 × 10⁹ M⊙), we then refined our
estimate to v_FWHM < √(M_dyn G / R) ~ 200 km s⁻¹ (G is the gravitational
constant), consistent with clump velocities typically observed in
high-redshift galaxies. This leads to a dynamical timescale
t_dyn = 2πR/(v_FWHM/2) ~ 29 Myr, in reasonable agreement with the free-fall
time of the clump, t_ff ∝ √(R³/M_dyn) ~ 17 Myr.

Clump formation rate and lifetime 

The visibility window of the young phase can be defined as the time during
which the EW is above a given threshold, as predicted by stellar population
synthesis models. For our clump this ranges between ~ 5 Myr (instantaneous
burst) and ~ 10 Myr for the SFH from simulations. We used an average
visibility window of 7 Myr.

Knowing the visibility window and the observed number of “formation
events” per galaxy, the “clump formation rate” can be estimated. Comparing
it to the average number of descendants observable per galaxy (virtually
all old) then yields the average lifetime of the clumps.

We considered all galaxies in our survey with: (i) M* > 8.5 × 10^9 M_⊙ (mass
completeness, coinciding with the minimum mass that a galaxy should have to
host such a massive clump assuming a gas fraction of 50%); (ii)
M* < 2 × 10^11 M_⊙ ([OIII] emission becomes too weak at higher masses^^);
and (iii) a redshift 1.2 < z < 2.4 ([OIII] emission lying inside the
wavelength range of the grism). This selection yields 57 galaxies. With one
“forming clump” detected within a visibility window of 7 Myr, this
corresponds to a “clump formation rate” of 1/(57 × 7 Myr) ≈ 2.5 Gyr^-1 per
galaxy.

We considered that in our survey we would have detected all formation
events of clumps with M_gas ≳ 2.5 × 10^9 M_⊙. Considering that almost all the
initial gas mass of a given clump is consumed at initial stages to form stars,
the clump stellar mass at late stages can be approximated by the gas mass at
initial collapse, independently of the age of the clump, as supported by our
numerical simulations. Typically there are ~ 1 - 2 clumps per galaxy above
such a mass threshold^^’^^’^®, giving an average lifetime of 500 Myr.
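
The rate and lifetime estimates reduce to two divisions, sketched below. This is an illustrative check under stated assumptions rather than the authors' code: the function names are hypothetical, and the value of 1.25 old clumps per galaxy is an assumption chosen inside the quoted range of ~ 1 - 2 because it reproduces the ~ 500 Myr lifetime.

```python
def clump_formation_rate_per_gyr(n_forming, n_galaxies, visibility_myr):
    """Formation events per galaxy per Gyr, given the visibility window."""
    return n_forming / (n_galaxies * visibility_myr / 1000.0)


def clump_lifetime_myr(n_old_per_galaxy, rate_per_gyr):
    """Steady-state lifetime: observed descendants divided by formation rate."""
    return n_old_per_galaxy / rate_per_gyr * 1000.0


# Values quoted in the text: 1 forming clump, 57 galaxies, 7 Myr window.
rate = clump_formation_rate_per_gyr(1, 57, 7.0)
print(f"formation rate = {rate:.2f} per Gyr per galaxy")

# 1 and 2 bracket the quoted range; 1.25 is an illustrative assumption.
for n_old in (1.0, 1.25, 2.0):
    lifetime = clump_lifetime_myr(n_old, rate)
    print(f"{n_old:.2f} old clumps per galaxy -> lifetime = {lifetime:.0f} Myr")
```

The quoted range of 1 to 2 old clumps per galaxy maps to lifetimes of roughly 400 to 800 Myr, which brackets the adopted 500 Myr.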

To compute the (large) associated uncertainty, we considered (i) the Poisson
error associated with our single-object discovery; (ii) the Poisson error for
older clumps from the literature; and (iii) the visibility window uncertainty.
The asymmetric 1σ uncertainties that we inferred are +0.74 dex and -0.55
dex. The lower envelope of the 1σ range of the lifetime estimate is not far
from the upper range of lifetimes suggested by models in which clumps suffer
from strong feedback (50 - 100 Myr). However, our estimate is likely a
lower limit. The derived lifetime could be affected by a “discovery bias”,
since other high-redshift spectroscopic surveys (e.g., SINS, 3D-HST) have not
yet reported the observation of a similar giant young clump. Furthermore,
there are indications^® that our target galaxy is living in a gas-enriched
environment, which could also have anomalously increased the SFR and thus the
“clump formation rate”. This suggests that the observation of a newly formed
giant clump could be an even rarer event than it appears from our data, and
that the true average clump lifetime could be longer than estimated here.

Code availability 

The RAMSES code used to generate our simulations is available at:
http://www.ics.uzh.ch/~teyssier/ramses

Extended Data Figure 1: GALFIT decomposition of the clump. The [OIII]
map (a) and the model of the point source component for the clump (b) are 
shown. No strong residuals or artifacts are left after removal of the point source 
component (c). The position of the nucleus and of the clump are shown as 
crosses. 





Extended Data Figure 2: Image ratios and mass map. The ratio of
F105W/F606W imaging in F_ν scale (a), a proxy for the dust reddening of
the stellar continuum, and F140W/F105W imaging, sensitive to the M/L ratio
(b), are shown. The positions of the nucleus and the clump are shown as
crosses. The maps show only small variations: the observed F_ν ratios for the
nucleus (clump) positions are 1.34 (1.16) for F105W/F606W and 1.39 (1.25)
for F140W/F105W. Galaxy-wide ratios are 1.27 and 1.37 for F105W/F606W
and F140W/F105W, respectively. The mass map (c) is shown in units of
log10(M_⊙/pixel).

Extended Data Figure 3: Modelling of the galaxy light profile. The F140W
direct image and [OIII] emission line map (a, d, g and j), the GALFIT models
(b, e, h, k) and the residuals (c, f, i, l) are shown. The first row shows the
single Sersic profile solution, the MGE model is in the second row, and our
baseline model (the sum of three Sersic profiles; blue crosses mark the
additional components) in the third row. The red cross indicates the
barycenter of the stellar light and the green one marks the center of the
[OIII] off-nuclear component.














Extended Data Figure 4: The Asymmetry and M20 morphological parameters
as determined from the spatial distribution of the galaxy stellar mass. Pink
and light blue triangles represent disks and mergers from MIRAGE numerical
simulations^^, respectively. The galaxy presented in this work (red filled
circle with s.d. error bars) is located in the typical region occupied by disk
galaxies^^, while the vast majority of mergers have higher Asymmetry and/or
M20 parameters. We note that the figure shows the same number of mergers
and disks even if in optical samples mergers are expected to be a minority.

Extended Data Figure 5: Clump continuum flux upper limits. The observed
flux upper limits estimated from simulations and GALFIT modelling in the
three bands are shown as black filled circles. The black horizontal lines
indicate the bandpass width of each filter. Colored curves represent reddened
Starburst99 stellar population synthesis models^^ with different ages (from
8 to 20 Myr), normalized to the most stringent upper limit (F105W band).
The corresponding upper limits in F140W and F606W obtained considering a
spectrum with an age ~ 10 Myr are shown as grey filled circles.













Extended Data Figure 6: Emission line diagnostics. The BPT diagram®'^ (a)
shows that the emission line ratios of the whole galaxy and of the clump (red
and light blue points with s.d. error bars) are consistent with being powered
by star formation. The [NII] upper limit and Hα emission of the whole galaxy
are measured from the Subaru/MOIRCS longslit spectroscopy follow-up, and
the [NII]/Hα upper limit for the clump is computed assuming the [NII] of the
whole galaxy. The metallicities of the whole galaxy and of the clump have
been determined from the [OIII]/[OII] ratio®^ (b).

Extended Data Figure 7: Time evolution of physical quantities based on the
clump SFR(t) from our simulations. In panel a the peak of all the curves is
normalized to 1 to highlight the time delay occurring between the peak of the
SFR and of the luminosities L_Hα, L =, L %, whereas in panel b they are
normalized to 1 at t = 1 Gyr to stress the relative intensity of the observables 
at the peak and later phases. The vertical black dotted line indicates the upper 
limit on the age of the clump (t = 10 Myr). The units of the plotted quantities 

Tables


Extended Data Table 1: Properties of the galaxy and the clump.

Quantity                                      Galaxy (ID568)    Clump (Vycl)
Right ascension [h m s]                       14:49:12.578      14:49:12.575
Declination [° ′ ″]                           +8:56:19.42       +8:56:19.62
Re [kpc]                                      2.8 ± 0.4 (a)     < 0.5
SFR [M_⊙/yr]                                  77 ± 9            32 ± 6
log(M*/M_⊙)                                   10.3 ± 0.2        < 8.5
log(M_gas/M_⊙)                                10.7 ± 0.2 (b)    < 9.4
Z [Z_⊙]                                       0.6 ± 0.2         0.4 ± 0.2
F_[OIII] [10^-17 erg s^-1 cm^-2]              10.4 ± 0.7        4.3 ± 0.2
F_Hβ [10^-17 erg s^-1 cm^-2]                  1.5 ± 0.8         0.9 ± 0.3
F_[OII] [10^-17 erg s^-1 cm^-2]               6.5 ± 1.7         1.9 ± 0.6
F^obs_F140W [10^-20 erg s^-1 cm^-2 Å^-1]      67.5 ± 3.4 (c)    < 1.1
F^obs_F105W [10^-20 erg s^-1 cm^-2 Å^-1]      89.2 ± 4.6 (c)    < 1.8
F^obs_F606W [10^-20 erg s^-1 cm^-2 Å^-1]      212.3 ± 10.6 (c)  < 4.5

Notes: (a) The effective radius of the galaxy is the average of the Re
obtained from a single Sersic profile fit in the F140W, F105W and F606W
imaging. (b) The gas mass of the galaxy has been determined, given its SFR,
as: log(M_gas) = 9.18 + 0.83 log(SFR)^®. (c) The observed flux of the F140W,
F105W and F606W direct images has been determined with GALFIT. We associated
a standard uncertainty of 5%.
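
The scaling quoted in the table notes for the galaxy gas mass can be evaluated directly, as in the minimal sketch below. The function names are hypothetical, and the depletion time M_gas/SFR is the standard definition, assumed here for illustration rather than taken from the authors' procedure. The clump value uses the upper limit log(M_gas) < 9.4, so its depletion time is also an upper limit.

```python
import math


def log_gas_mass_from_sfr(sfr_msun_per_yr):
    """log10(M_gas / Msun) from the relation 9.18 + 0.83 * log10(SFR)."""
    return 9.18 + 0.83 * math.log10(sfr_msun_per_yr)


def depletion_time_myr(log_m_gas, sfr_msun_per_yr):
    """Gas depletion time M_gas / SFR in Myr (standard definition, assumed)."""
    return 10.0 ** log_m_gas / sfr_msun_per_yr / 1.0e6


# Table values: galaxy SFR = 77 Msun/yr; clump SFR = 32 Msun/yr, log M_gas < 9.4.
log_m_gas_galaxy = log_gas_mass_from_sfr(77.0)
t_dep_galaxy = depletion_time_myr(log_m_gas_galaxy, 77.0)
t_dep_clump_max = depletion_time_myr(9.4, 32.0)

print(f"galaxy log(M_gas)       = {log_m_gas_galaxy:.2f}")
print(f"galaxy depletion time   = {t_dep_galaxy:.0f} Myr")
print(f"clump depletion time    < {t_dep_clump_max:.0f} Myr")
```

The relation returns log(M_gas) ≈ 10.75 for the galaxy, in line with the tabulated 10.7 ± 0.2.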



Extended Data Table 2: HST/WFC3 and Subaru/MOIRCS observations.

Instrument       Date                        Time (direct imaging)   Time (spectroscopy)
                                             (hr)                    (hr)
HST/WFC3         2010, 6th June              0.3 (F140W)             2.7
HST/WFC3         2010, 25th June, 1st July   0.6 (F140W)             7
HST/WFC3         2010, 9th July              0.3 (F140W)             2.7
HST/WFC3         2013, 20th May              3.3 (F105W)             -
HST/WFC3         2013, 20th May              0.3 (F606W)             -
Subaru/MOIRCS    2013, 7th - 9th April       -                       7.3





